Policy Feedback for the Refinement of Learned Motion Control on a Mobile Robot

Authors

  • Brenna Argall
  • Brett Browning
  • Manuela M. Veloso
Abstract

Motion control is fundamental to mobile robots, and the associated challenge in development can be assisted by the incorporation of execution experience to increase policy robustness. In this work, we present an approach that updates a policy learned from demonstration with human teacher feedback. We contribute advice-operators as a feedback form that provides corrections on state-action pairs produced during a learner execution, and Focused Feedback for Mobile Robot Policies (F3MRP) as a framework for providing feedback to rapidly-sampled policies. Both are appropriate for mobile robot motion control domains. We present a general feedback algorithm in which multiple types of feedback, including advice-operators, are provided through the F3MRP framework, and shown to improve policies initially derived from a set of behavior examples. A comparison to providing more behavior examples instead of more feedback finds data to be generated in different areas of the state and action spaces, and feedback to be more effective at improving policy performance while producing smaller datasets.

  • B.D. Argall: Depts. of Electrical Engineering & Computer Science and Physical Medicine & Rehabilitation, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA
  • B. Browning: The Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
  • M.M. Veloso: Computer Science Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA

Similar resources

Dynamic Load Carrying Capacity of Mobile-Base Flexible-Link Manipulators: Feedback Linearization Control Approach

This paper focuses on the effects of closed-loop control on the calculation of the dynamic load carrying capacity (DLCC) for mobile-base flexible-link manipulators. In methods previously proposed in the literature for DLCC calculation in flexible robots, an open-loop control scheme is assumed, whereas in reality, robot control is achieved via closed-loop approaches which could render the calculated ...

Mobile Robot Motion Control from Demonstration and Corrective Feedback

Robust motion control algorithms are fundamental to the successful, autonomous operation of mobile robots. Motion control is known to be a difficult problem, and is often dictated by a policy, or state-action mapping. In this chapter, we present an approach for the refinement of mobile robot motion control policies that incorporates corrective feedback from a human teacher. The target applicat...

Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot

Task demonstration is an effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work, we introduce an algorithm that uses corrective human feedback to build a policy able to perform a novel task, by combining simpler policies learned from demonstration. While some demonstration-based learning approach...

Direct Optimal Motion Planning for Omni-directional Mobile Robots under Limitation on Velocity and Acceleration

This paper describes a computationally light direct approach to optimal motion planning and obstacle avoidance for omnidirectional mobile robots under velocity and acceleration constraints on the robot motion. The main objective is to minimize a quadratic cost function while limits on the robot's velocity and acceleration are taken into account and collision with any obstacle in the...

Learning Mobile Robot Motion Control from Demonstrated Primitives and Human Feedback

Task demonstration is one effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work we introduce a technique that uses corrective human feedback to build a policy able to perform an undemonstrated task from simpler policies learned from demonstration. Our algorithm first evaluates and corrects t...

Journal:
  • I. J. Social Robotics

Volume 4

Publication date: 2012